Bending the Curve: Improving the ROC Curve Through Error Redistribution
Abstract
Classification performance is often not uniform over the data: some areas of the input space are easier to classify than others. Features that hold information about the "difficulty" of the data may be nondiscriminative and are therefore disregarded in the classification process. We propose a meta-learning approach in which performance may be improved by post-processing, namely by establishing a dynamic threshold on the base-classifier results. Since the base classifier is treated as a "black box", the method presented can be applied to any state-of-the-art classifier in an attempt to improve its performance. We focus on how to better control the true-positive/false-positive tradeoff, known as the ROC curve. We propose an algorithm for deriving optimal thresholds by redistributing the error depending on features that hold information about difficulty. We demonstrate the resulting benefit on both synthetic and real-life data.
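The idea of a difficulty-dependent threshold can be illustrated with a small sketch. This is not the algorithm proposed in the paper, only a brute-force toy under stated assumptions: the difficulty feature is a discrete group label ("easy" or "hard"), the base-classifier scores are synthetic, and the names `roc_point` and `best_thresholds` are hypothetical. For a fixed false-positive budget, it compares the best single threshold against the best pair of per-group thresholds.

```python
import random
from itertools import product


def roc_point(scores, labels, groups, thresholds):
    """Return (FPR, TPR) when sample i is predicted positive
    if scores[i] >= thresholds[groups[i]]."""
    tp = fp = pos = neg = 0
    for s, y, g in zip(scores, labels, groups):
        pred = s >= thresholds[g]
        if y == 1:
            pos += 1
            tp += pred
        else:
            neg += 1
            fp += pred
    return fp / neg, tp / pos


def best_thresholds(scores, labels, groups, max_fpr, grid, shared):
    """Exhaustively search the grid for the thresholds with the highest TPR
    whose FPR does not exceed max_fpr. With shared=True every group uses
    the same threshold (the usual single-threshold baseline)."""
    names = sorted(set(groups))
    if shared:
        candidates = [dict.fromkeys(names, t) for t in grid]
    else:
        candidates = [dict(zip(names, ts))
                      for ts in product(grid, repeat=len(names))]
    best = None
    for th in candidates:
        fpr, tpr = roc_point(scores, labels, groups, th)
        if fpr <= max_fpr and (best is None or tpr > best[1]):
            best = (fpr, tpr, th)
    return best


# Synthetic black-box scores (an assumption for illustration only):
# the "easy" group is well separated, the "hard" group overlaps heavily.
rng = random.Random(0)
scores, labels, groups = [], [], []
for g, (mu_pos, mu_neg, sd) in {"easy": (0.8, 0.2, 0.1),
                                "hard": (0.6, 0.4, 0.2)}.items():
    for y, mu in ((1, mu_pos), (0, mu_neg)):
        for _ in range(200):
            scores.append(min(1.0, max(0.0, rng.gauss(mu, sd))))
            labels.append(y)
            groups.append(g)

# The grid extends above 1.0 so that "predict nothing positive" is feasible.
grid = [i / 50 for i in range(52)]
global_best = best_thresholds(scores, labels, groups, 0.1, grid, shared=True)
group_best = best_thresholds(scores, labels, groups, 0.1, grid, shared=False)
print("single threshold:", global_best)
print("per-group thresholds:", group_best)
```

Because every shared threshold is also a valid per-group choice, the per-group search can never do worse than the single-threshold baseline at the same false-positive budget; how much it gains depends on how different the groups are. The exhaustive search grows exponentially with the number of groups, so it serves only as a reference point for small examples.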
Similar resources
Receiver Operating Characteristic (ROC) Curve Analysis for Medical Diagnostic Test Evaluation
This review provides the basic principle and rationale for ROC analysis of rating and continuous diagnostic test results versus a gold standard. Derived indexes of accuracy, in particular the area under the curve (AUC), have a meaningful interpretation for distinguishing diseased from healthy subjects. The methods of estimating AUC and its testing in a single diagnostic test and also comparative studies...
Comparison of Logistic Regression Models with Discriminant Analysis in Predicting Type 2 Diabetes
Background and Objectives: Diabetes is a chronic and common metabolic disease which has no curative treatment. Logistic regression (LR) is a statistical model for the analysis and prediction in multivariate statistical techniques. Discriminant analysis is a method for separating observations in terms of dependent variable levels which can allocate any new observation after making discriminating...
Capturing Outlines of Planar Generic Images by Simultaneous Curve Fitting and Sub-division
In this paper, a new technique has been designed to capture the outline of 2D shapes using cubic Bézier curves. The proposed technique avoids the traditional method of optimizing the global squared fitting error and emphasizes the local control of data points. A maximum error has been determined to preserve the absolute fitting error less than a criterion and it administers the process of curv...
Non-parametric estimation of ROC curve
Receiver operating characteristic (ROC) curve is widely applied in measuring discriminatory ability of diagnostic or prognostic tests. This makes ROC analysis one of the most active research areas in medical statistics. Many parametric and semiparametric estimation methods have been proposed for estimating the ROC curve and its functionals. In this paper, we propose a fully nonparametric Bayesi...
Comparison of Gestational Diabetes Prediction Between Logistic Regression, Discriminant Analysis, Decision Tree and Artificial Neural Network Models
Background and Objectives: Gestational Diabetes Mellitus (GDM) is the most common metabolic disorder in pregnancy. In case of early detection, some of its complications can be prevented. The aim of this study was to investigate early prediction of GDM by logistic regression (LR), discriminant analysis (DA), decision tree (DT) and perceptron artificial neural network (ANN) and to compare these m...
Journal: CoRR
Volume: abs/1605.06652
Publication date: 2016